Practical Lab 2 - Data Visualization and Publication

This project showcases the capabilities of Matplotlib, Seaborn, and Plotly in visual storytelling for data analysis.

Matplotlib

Matplotlib is a Python plotting library. It provides functions for building line plots, scatter plots, bar charts, histograms, and many other chart types.

Data Science Salaries

This section analyzes and visualizes hypothetical average salaries for five data science roles over five years (2020 to 2024). The roles are Data Analyst, Machine Learning Engineer, Data Scientist, AI Researcher, and Senior Data Scientist.

  1. Importing libraries
In [87]:
import matplotlib.pyplot as plt
import pandas as pd
  2. Importing data from CSV and displaying it as a table.
In [88]:
df = pd.read_csv("fake_ds_salaries.csv")
df
Out[88]:
   Year  Data Analyst  Machine Learning Eng.  Data Scientist  AI Researcher  Sr. Data Scientist
0  2020         58000                  82000           95000         105000              118000
1  2021         62000                  87000           98000         112000              127000
2  2022         68000                  91000          102000         117000              135000
3  2023         72000                  96000          108000         125000              145000
4  2024         76000                 102000          115000         133000              155000
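Before plotting, it can help to quantify how much each role's salary changes over the period. The following is a minimal sketch, not part of the original notebook: it rebuilds the table above inline (the notebook itself reads it from the CSV file), and the names `salaries` and `total_growth_pct` are chosen here for illustration.

```python
import pandas as pd

# Values copied from the table above; the original notebook loads them
# from fake_ds_salaries.csv instead of defining them inline.
salaries = pd.DataFrame(
    {
        "Year": [2020, 2021, 2022, 2023, 2024],
        "Data Analyst": [58000, 62000, 68000, 72000, 76000],
        "Machine Learning Eng.": [82000, 87000, 91000, 96000, 102000],
        "Data Scientist": [95000, 98000, 102000, 108000, 115000],
        "AI Researcher": [105000, 112000, 117000, 125000, 133000],
        "Sr. Data Scientist": [118000, 127000, 135000, 145000, 155000],
    }
).set_index("Year")

# Percentage change between the first and the last year, per role.
total_growth_pct = (salaries.iloc[-1] / salaries.iloc[0] - 1) * 100
print(total_growth_pct.round(1))
```

Indexing by `Year` keeps the year column out of the arithmetic, so the division only touches salary columns.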
  3. Plotting one line per role to show how data science salaries change over the five years.
In [89]:
fig, ax = plt.subplots()
ax.plot(df["Year"],df["Data Analyst"], label="Data Analyst")
ax.plot(df["Year"],df["Machine Learning Eng."], label="Machine Learning Eng.")
ax.plot(df["Year"],df["Data Scientist"], label="Data Scientist")
ax.plot(df["Year"],df["AI Researcher"], label="AI Researcher")
ax.plot(df["Year"],df["Sr. Data Scientist"], label="Sr. Data Scientist")
ax.legend()
Out[89]:
<matplotlib.legend.Legend at 0x221e4998450>
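The five `ax.plot` calls above differ only in the column name, so the same chart can be drawn with a loop. The sketch below is one possible variant rather than the notebook's own code: it rebuilds the data inline, and the axis labels, title, and markers are additions assumed here for readability. No currency is given in the source, so the y-axis label leaves it out.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Values copied from the table above; the notebook reads them from the CSV file.
salaries = pd.DataFrame(
    {
        "Year": [2020, 2021, 2022, 2023, 2024],
        "Data Analyst": [58000, 62000, 68000, 72000, 76000],
        "Machine Learning Eng.": [82000, 87000, 91000, 96000, 102000],
        "Data Scientist": [95000, 98000, 102000, 108000, 115000],
        "AI Researcher": [105000, 112000, 117000, 125000, 133000],
        "Sr. Data Scientist": [118000, 127000, 135000, 145000, 155000],
    }
)

fig, ax = plt.subplots()

# One line per role: every column except "Year" is a salary series.
for role in salaries.columns.drop("Year"):
    ax.plot(salaries["Year"], salaries[role], marker="o", label=role)

ax.set_xticks(salaries["Year"])  # whole years only, no fractional ticks
ax.set_xlabel("Year")
ax.set_ylabel("Average salary")
ax.set_title("Average data science salaries by role (hypothetical data)")
ax.legend()
```

In a notebook, ending the cell with `plt.show()` displays the figure without printing the `Legend` object as cell output.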

Plotly

Plotly is a Python library for building interactive charts that support hovering, zooming, and panning. Its high-level interface, Plotly Express, creates common chart types such as line charts, scatter plots, bar charts, and histograms with a single function call.

Canada's GDP Trends Using Plotly

This section aims to analyze Canada's economic trends from 2014 to 2023 through the lens of its Gross Domestic Product (GDP).

  1. Importing libraries
In [90]:
import pandas as pd
import plotly.express as px
import plotly

plotly.offline.init_notebook_mode()
  2. Importing data from JSON and displaying it as a table.
In [91]:
df = pd.read_json('canada_gdp.json')
df
Out[91]:
   year      GDP
0  2014  1822.97
1  2015  1550.54
2  2016  1534.08
3  2017  1677.42
4  2018  1713.84
5  2019  1737.69
6  2020  1628.73
7  2021  1713.80
8  2022  1896.67
9  2023  2010.24
  3. Generating a line chart
In [92]:
fig = px.line(df, x="year", y="GDP",title="Canada GDP over last 10 years.")
fig.show()

Seaborn

This section uses Seaborn to explore how economic and social indicators relate to each other across countries. The dataset contains the GDP, literacy rate, and GDP per capita of a varied set of countries.

  1. Importing libraries
In [93]:
# pandas (pd) and matplotlib.pyplot (plt) are already imported in the cells above.
import seaborn as sns
sns.set_theme(style="white")
  2. Importing data from CSV and displaying the first rows as a table.
In [94]:
df = pd.read_csv("fake_literacy_data.csv")
df.head()
Out[94]:
   literacy_rate  country_name      GDP  GDP_per_capita
0           91.0       Finland  44644.2         55349.8
1           99.0       Denmark  57637.5         75099.5
2           99.0   Switzerland  67789.2         84184.6
3           99.0       Iceland  52062.2         68489.3
4           95.0   Netherlands  55646.5         62549.7
  3. Generating a scatter plot
In [95]:
sctplt = sns.relplot(x="GDP", y="country_name", hue="literacy_rate", size="GDP_per_capita",
                     sizes=(100, 400), alpha=0.7, palette="muted",
                     height=8, data=df)
sctplt.set_ylabels("Country Name")
# Use the public legend attribute rather than the private _legend.
sctplt.legend.set_title("Literacy Rate and GDP per Capita")
plt.show()
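The chart above places country names on the y-axis, so only GDP is read from an axis position. As a complementary view, the sketch below puts GDP and GDP per capita on the two axes and labels each point with its country. It is an alternative illustration, not the notebook's chart: it uses only the five rows displayed above, and the variable name `indicators`, the marker size, and the label offsets are assumptions made here.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# The five rows shown by df.head() above; the full CSV contains more countries.
indicators = pd.DataFrame(
    {
        "literacy_rate": [91.0, 99.0, 99.0, 99.0, 95.0],
        "country_name": ["Finland", "Denmark", "Switzerland", "Iceland", "Netherlands"],
        "GDP": [44644.2, 57637.5, 67789.2, 52062.2, 55646.5],
        "GDP_per_capita": [55349.8, 75099.5, 84184.6, 68489.3, 62549.7],
    }
)

fig, ax = plt.subplots(figsize=(8, 6))
sns.scatterplot(
    data=indicators,
    x="GDP",
    y="GDP_per_capita",
    hue="literacy_rate",  # colour encodes the literacy rate
    s=200,
    alpha=0.7,
    ax=ax,
)

# Label every point with its country name, slightly offset from the marker.
for row in indicators.itertuples():
    ax.annotate(
        row.country_name,
        (row.GDP, row.GDP_per_capita),
        textcoords="offset points",
        xytext=(8, 4),
    )

ax.set_xlabel("GDP")
ax.set_ylabel("GDP per capita")
ax.get_legend().set_title("Literacy rate")
```

With both indicators on numeric axes, any relationship between GDP and GDP per capita shows up as a trend in the point cloud, while colour carries the literacy rate.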

Note: None of the data used to generate these graphs is real. It is used for illustrative purposes only.

Libraries used in this project

Plotly, Matplotlib, Seaborn